ES-LDA: Entity Summarization using Knowledge-based Topic Modeling

Authors

  • Seyed Amin Pouriyeh
  • Mehdi Allahyari
  • Krzysztof Kochut
  • Gong Cheng
  • Hamid R. Arabnia
Abstract

With the advent of the Internet, the number of Semantic Web documents that describe real-world entities and their inter-links as sets of statements has grown considerably. These descriptions are usually lengthy, which makes it difficult to use the underlying entities. Entity summarization, which aims to create summaries for real-world entities, has gained increasing attention in recent years. In this paper, we propose a probabilistic topic model, ES-LDA, that combines prior knowledge with statistical learning techniques within a single framework to create more reliable and representative summaries for entities. We demonstrate the effectiveness of our approach by conducting extensive experiments and show that our model outperforms the state-of-the-art techniques and enhances the quality of the...

Similar resources

The THU Summarization Systems at TAC 2010

The TAC 2010 Guided Summarization task requires participants to generate coherent summaries with the guidance of predefined categories and aspects. In this paper, we present our two extractive summarization systems. In the first system, we employ a topic model Labeled LDA to model the aspects. The correspondence between the aspects and the topics in Labeled LDA is established through identifyin...

Multi Domain Semantic Information Retrieval Based on Topic Model

Over the last decades, there have been remarkable shifts in the area of Information Retrieval (IR) as a huge amount of information is increasingly accumulated on the Web. This information explosion increases the need for new tools that retrieve meaningful knowledge from various complex information sources. Thus, techniques primarily used to search and extract important informa...

Using Latent Dirichlet Allocation to Incorporate Domain Knowledge with Concept based Approach for Automatic Topic Detection

In the past couple of years, multi-topic summarization has been a research area that has attracted much attention. There have been various efforts to generate natural language summaries for a variety of topics, but this is feasible only for a very small number of topics. This paper presents a method that attempts to automatically detect the topics to be summarized, that is, it can determine ...

Comparative Summarization via Latent Dirichlet Allocation

This paper aims to explore the possibility of using Latent Dirichlet Allocation (LDA) for multi-document comparative summarization which detects the main differences in documents. The first two sections of this paper focus on the definition of comparative summarization and a brief explanation of using the LDA topic model in this context. In the last three sections, our novel method for multi-do...

Summarization of Corporate Risk Factor Disclosure through Topic Modeling

In this paper, we propose a novel problem of summarizing textual corporate risk factor disclosure, which aims to simultaneously infer the risk types across the corpus and assign each risk factor to its most probable risk type. To solve the problem, we develop a variation of the LDA topic model called Sent-LDA. The variational EM learning algorithm, which guarantees fast convergence, is derived and impl...

Publication date: 2017